Dataset statistics
| Number of variables | 17 |
|---|---|
| Number of observations | 2032002 |
| Missing cells | 2712023 |
| Missing cells (%) | 7.9% |
| Duplicate rows | 6 |
| Duplicate rows (%) | < 0.1% |
| Total size in memory | 263.6 MiB |
| Average record size in memory | 136.0 B |
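The derived figures in this table can be re-checked from the raw counts. The sketch below assumes that "Missing cells (%)" is taken over all cells (observations times variables) and that 1 MiB is 1,048,576 bytes; the variable names are illustrative and not part of the report.

```python
# Re-derive two summary figures from the counts reported above.
n_variables = 17
n_observations = 2_032_002
missing_cells = 2_712_023
total_size_mib = 263.6

# Assumption: the percentage is taken over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions both values round to the figures in the table (7.9% and 136.0 B).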
Variable types
| Categorical | 15 |
|---|---|
| Numeric | 2 |
Alerts
| Alert | Type |
|---|---|
| Regiao - Sigla has constant value "SE" | Constant |
| Estado - Sigla has constant value "SP" | Constant |
| Produto has constant value "ETANOL" | Constant |
| Unidade de Medida has constant value "R$ / litro" | Constant |
| Dataset has 6 (< 0.1%) duplicate rows | Duplicates |
| Municipio has a high cardinality: 127 distinct values | High cardinality |
| Revenda has a high cardinality: 9618 distinct values | High cardinality |
| CNPJ da Revenda has a high cardinality: 9925 distinct values | High cardinality |
| Nome da Rua has a high cardinality: 5284 distinct values | High cardinality |
| Numero Rua has a high cardinality: 3137 distinct values | High cardinality |
| Complemento has a high cardinality: 747 distinct values | High cardinality |
| Bairro has a high cardinality: 3652 distinct values | High cardinality |
| Cep has a high cardinality: 5609 distinct values | High cardinality |
| Data da Coleta has a high cardinality: 3691 distinct values | High cardinality |
| Bandeira has a high cardinality: 142 distinct values | High cardinality |
| Valor de Venda is highly correlated with Semestre and 1 other field | High correlation |
| Valor de Compra is highly correlated with Semestre and 1 other field | High correlation |
| Estado - Sigla is highly correlated with Regiao - Sigla and 3 other fields | High correlation |
| Regiao - Sigla is highly correlated with Estado - Sigla and 3 other fields | High correlation |
| Produto is highly correlated with Estado - Sigla and 3 other fields | High correlation |
| Semestre is highly correlated with Valor de Venda and 1 other field | High correlation |
| Unidade de Medida is highly correlated with Estado - Sigla and 3 other fields | High correlation |
| Complemento has 1741308 (85.7%) missing values | Missing |
| Valor de Compra has 965061 (47.5%) missing values | Missing |
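The "Constant" and "High cardinality" alerts above follow from each variable's distinct-value count. A minimal sketch of that rule is shown here; the cut-off of 50 distinct values is an assumption (the report does not state it), and `classify` is a hypothetical helper, not a pandas-profiling function.

```python
# Assumed cut-off: variables with more distinct values than this are
# flagged as high cardinality. The value 50 is an assumption, not
# something stated in the report.
HIGH_CARDINALITY_THRESHOLD = 50

# Distinct-value counts copied from the alerts above.
distinct_counts = {
    "Estado - Sigla": 1,
    "Produto": 1,
    "Municipio": 127,
    "Bandeira": 142,
    "Revenda": 9618,
}


def classify(n_distinct, threshold=HIGH_CARDINALITY_THRESHOLD):
    """Return the alert type implied by a distinct-value count, or None."""
    if n_distinct == 1:
        return "Constant"
    if n_distinct > threshold:
        return "High cardinality"
    return None


for name, n_distinct in distinct_counts.items():
    print(f"{name}: {classify(n_distinct)}")
```

With this cut-off, a variable with a few dozen distinct values would raise neither alert.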
Reproduction
| Analysis started | 2022-09-21 00:30:45.576873 |
|---|---|
| Analysis finished | 2022-09-21 00:32:27.357115 |
| Duration | 1 minute and 41.78 seconds |
| Software version | pandas-profiling v3.3.0 |
| Download configuration | config.json |
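The reported duration can be checked against the two timestamps. This is a small sketch using only the standard library; the format string is an assumption based on how the timestamps are printed above.

```python
from datetime import datetime

# Assumed layout of the timestamps printed in the table above.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2022-09-21 00:30:45.576873", FMT)
finished = datetime.strptime("2022-09-21 00:32:27.357115", FMT)

elapsed = finished - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)
print(f"{int(minutes)} minute(s) and {seconds:.2f} seconds")
```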
Semestre
| Distinct | 36 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 15.5 MiB |
| 2004-02 | 106032 |
|---|---|
| 2005-01 | 105316 |
| 2007-01 | 91080 |
| 2006-01 | 84959 |
| 2006-02 | 80790 |
| Other values (31) | 1563825 |
Length
| Max length | 7 |
|---|---|
| Median length | 7 |
| Mean length | 7 |
| Min length | 7 |
Characters and Unicode
| Total characters | 14224014 |
|---|---|
| Distinct characters | 11 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
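As an illustration of these character properties, the sketch below looks up the Unicode general category of each character in one value of this variable, using Python's standard `unicodedata` module. It covers categories only; scripts and blocks are not exposed by that module.

```python
import unicodedata
from collections import Counter

# One value of this variable, as listed in the report.
value = "2020-01"

# General category of each code point: "Nd" is Decimal Number,
# "Pd" is Dash Punctuation.
categories = Counter(unicodedata.category(ch) for ch in value)
print(dict(categories))
```

Each value contributes six decimal digits and one dash, which matches the two distinct categories reported for this variable.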
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2020-01 |
|---|---|
| 2nd row | 2020-01 |
| 3rd row | 2020-01 |
| 4th row | 2020-01 |
| 5th row | 2020-01 |
Common Values
| Value | Count | Frequency (%) |
| 2004-02 | 106032 | 5.2% |
| 2005-01 | 105316 | 5.2% |
| 2007-01 | 91080 | 4.5% |
| 2006-01 | 84959 | 4.2% |
| 2006-02 | 80790 | 4.0% |
| 2005-02 | 77088 | 3.8% |
| 2007-02 | 69732 | 3.4% |
| 2008-02 | 64453 | 3.2% |
| 2014-02 | 63522 | 3.1% |
| 2013-02 | 63111 | 3.1% |
| Other values (26) | 1225919 | 60.3% |
Length
[Figure: histogram of lengths of the category]
Words
| Value | Count | Frequency (%) |
| 2004-02 | 106032 | 5.2% |
| 2005-01 | 105316 | 5.2% |
| 2007-01 | 91080 | 4.5% |
| 2006-01 | 84959 | 4.2% |
| 2006-02 | 80790 | 4.0% |
| 2005-02 | 77088 | 3.8% |
| 2007-02 | 69732 | 3.4% |
| 2008-02 | 64453 | 3.2% |
| 2014-02 | 63522 | 3.1% |
| 2013-02 | 63111 | 3.1% |
| Other values (26) | 1225919 | 60.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 5128535 | 36.1% |
| 2 | 3289103 | 23.1% |
| 1 | 2240762 | 15.8% |
| - | 2032002 | 14.3% |
| 5 | 278602 | 2.0% |
| 6 | 251194 | 1.8% |
| 4 | 243597 | 1.7% |
| 7 | 229273 | 1.6% |
| 8 | 210189 | 1.5% |
| 9 | 195947 | 1.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 12192012 | 85.7% |
| Dash Punctuation | 2032002 | 14.3% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 5128535 | 42.1% |
| 2 | 3289103 | 27.0% |
| 1 | 2240762 | 18.4% |
| 5 | 278602 | 2.3% |
| 6 | 251194 | 2.1% |
| 4 | 243597 | 2.0% |
| 7 | 229273 | 1.9% |
| 8 | 210189 | 1.7% |
| 9 | 195947 | 1.6% |
| 3 | 124810 | 1.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 2032002 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 14224014 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 5128535 | 36.1% |
| 2 | 3289103 | 23.1% |
| 1 | 2240762 | 15.8% |
| - | 2032002 | 14.3% |
| 5 | 278602 | 2.0% |
| 6 | 251194 | 1.8% |
| 4 | 243597 | 1.7% |
| 7 | 229273 | 1.6% |
| 8 | 210189 | 1.5% |
| 9 | 195947 | 1.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 14224014 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 5128535 | 36.1% |
| 2 | 3289103 | 23.1% |
| 1 | 2240762 | 15.8% |
| - | 2032002 | 14.3% |
| 5 | 278602 | 2.0% |
| 6 | 251194 | 1.8% |
| 4 | 243597 | 1.7% |
| 7 | 229273 | 1.6% |
| 8 | 210189 | 1.5% |
| 9 | 195947 | 1.4% |
Regiao - Sigla
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 15.5 MiB |
| SE |
|---|
Length
| Max length | 2 |
|---|---|
| Median length | 2 |
| Mean length | 2 |
| Min length | 2 |
Characters and Unicode
| Total characters | 4064004 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | SE |
|---|---|
| 2nd row | SE |
| 3rd row | SE |
| 4th row | SE |
| 5th row | SE |
Common Values
| Value | Count | Frequency (%) |
| SE | 2032002 | 100.0% |
Length
[Figure: histogram of lengths of the category]
[Figure: category frequency plot]
Words
| Value | Count | Frequency (%) |
| se | 2032002 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| S | 2032002 | 50.0% |
| E | 2032002 | 50.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 4064004 | 100.0% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 2032002 | 50.0% |
| E | 2032002 | 50.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 4064004 | 100.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| S | 2032002 | 50.0% |
| E | 2032002 | 50.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 4064004 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| S | 2032002 | 50.0% |
| E | 2032002 | 50.0% |
Estado - Sigla
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 15.5 MiB |
| SP |
|---|
Length
| Max length | 2 |
|---|---|
| Median length | 2 |
| Mean length | 2 |
| Min length | 2 |
Characters and Unicode
| Total characters | 4064004 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | SP |
|---|---|
| 2nd row | SP |
| 3rd row | SP |
| 4th row | SP |
| 5th row | SP |
Common Values
| Value | Count | Frequency (%) |
| SP | 2032002 | 100.0% |
Length
[Figure: histogram of lengths of the category]
[Figure: category frequency plot]
Words
| Value | Count | Frequency (%) |
| sp | 2032002 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| S | 2032002 | 50.0% |
| P | 2032002 | 50.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 4064004 | 100.0% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 2032002 | 50.0% |
| P | 2032002 | 50.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 4064004 | 100.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| S | 2032002 | 50.0% |
| P | 2032002 | 50.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 4064004 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| S | 2032002 | 50.0% |
| P | 2032002 | 50.0% |
Municipio
| Distinct | 127 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 15.5 MiB |
| SAO PAULO | 301391 |
|---|---|
| CAMPINAS | 58556 |
| RIBEIRAO PRETO | 50636 |
| SANTO ANDRE | 36754 |
| SAO JOSE DOS CAMPOS | 33787 |
| Other values (122) | 1550878 |
Length
| Max length | 23 |
|---|---|
| Median length | 19 |
| Mean length | 9.755570122 |
| Min length | 3 |
Characters and Unicode
| Total characters | 19823338 |
|---|---|
| Distinct characters | 24 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | GUARULHOS |
|---|---|
| 2nd row | ADAMANTINA |
| 3rd row | ADAMANTINA |
| 4th row | ADAMANTINA |
| 5th row | ADAMANTINA |
Common Values
| Value | Count | Frequency (%) |
| SAO PAULO | 301391 | 14.8% |
| CAMPINAS | 58556 | 2.9% |
| RIBEIRAO PRETO | 50636 | 2.5% |
| SANTO ANDRE | 36754 | 1.8% |
| SAO JOSE DOS CAMPOS | 33787 | 1.7% |
| BAURU | 33542 | 1.7% |
| SOROCABA | 31952 | 1.6% |
| SAO JOSE DO RIO PRETO | 31769 | 1.6% |
| OSASCO | 31372 | 1.5% |
| SANTOS | 30609 | 1.5% |
| Other values (117) | 1391634 | 68.5% |
Length
[Figure: histogram of lengths of the category]
Words
| Value | Count | Frequency (%) |
| sao | 477955 | 14.6% |
| paulo | 301391 | 9.2% |
| do | 89291 | 2.7% |
| preto | 82405 | 2.5% |
| jose | 73842 | 2.3% |
| ribeirao | 60071 | 1.8% |
| campinas | 58556 | 1.8% |
| rio | 55773 | 1.7% |
| da | 48390 | 1.5% |
| mogi | 46316 | 1.4% |
| Other values (148) | 1974001 | 60.4% |
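The lowercase tokens in the table above read like per-word counts over the municipality names. A minimal sketch of how such counts can be built from a value-count table is given below; it assumes tokens are lowercased and split on whitespace, which is an inference from the table rather than something the report states.

```python
from collections import Counter

# Counts copied from the Common Values table (a subset of entries).
value_counts = {
    "SAO PAULO": 301391,
    "RIBEIRAO PRETO": 50636,
    "SAO JOSE DOS CAMPOS": 33787,
    "SAO JOSE DO RIO PRETO": 31769,
}

word_counts = Counter()
for value, count in value_counts.items():
    # Assumption: tokens are lowercased and split on whitespace.
    for word in value.lower().split():
        word_counts[word] += count

print(word_counts.most_common(3))
```

With only these four municipalities, the totals for "paulo" (301391) and "preto" (82405) already equal the counts in the table, while "sao" comes out lower because other municipalities containing it are not included here.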
Most occurring characters
| Value | Count | Frequency (%) |
| A | 3828722 | 19.3% |
| O | 2191941 | 11.1% |
| S | 1442362 | 7.3% |
| R | 1392852 | 7.0% |
| I | 1344711 | 6.8% |
| (space) | 1235989 | 6.2% |
| U | 1116950 | 5.6% |
| E | 977248 | 4.9% |
| T | 904232 | 4.6% |
| P | 846626 | 4.3% |
| Other values (14) | 4541705 | 22.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 18570365 | 93.7% |
| Space Separator | 1235989 | 6.2% |
| Other Punctuation | 16984 | 0.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 3828722 | 20.6% |
| O | 2191941 | 11.8% |
| S | 1442362 | 7.8% |
| R | 1392852 | 7.5% |
| I | 1344711 | 7.2% |
| U | 1116950 | 6.0% |
| E | 977248 | 5.3% |
| T | 904232 | 4.9% |
| P | 846626 | 4.6% |
| N | 754358 | 4.1% |
| Other values (12) | 3770363 | 20.3% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 1235989 | 100.0% |
Other Punctuation
| Value | Count | Frequency (%) |
| ' | 16984 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 18570365 | 93.7% |
| Common | 1252973 | 6.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| A | 3828722 | 20.6% |
| O | 2191941 | 11.8% |
| S | 1442362 | 7.8% |
| R | 1392852 | 7.5% |
| I | 1344711 | 7.2% |
| U | 1116950 | 6.0% |
| E | 977248 | 5.3% |
| T | 904232 | 4.9% |
| P | 846626 | 4.6% |
| N | 754358 | 4.1% |
| Other values (12) | 3770363 | 20.3% |
Common
| Value | Count | Frequency (%) |
| (space) | 1235989 | 98.6% |
| ' | 16984 | 1.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 19823338 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| A | 3828722 | 19.3% |
| O | 2191941 | 11.1% |
| S | 1442362 | 7.3% |
| R | 1392852 | 7.0% |
| I | 1344711 | 6.8% |
| (space) | 1235989 | 6.2% |
| U | 1116950 | 5.6% |
| E | 977248 | 4.9% |
| T | 904232 | 4.6% |
| P | 846626 | 4.3% |
| Other values (14) | 4541705 | 22.9% |
Revenda
| Distinct | 9618 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 15.5 MiB |
| COMPANHIA BRASILEIRA DE DISTRIBUICAO | 12424 |
|---|---|
| SETE ESTRELAS COMERCIO DE DERIVADOS DE PETROLEO LTDA | 12410 |
| CARREFOUR COMERCIO E INDUSTRIA LTDA | 10273 |
| REDE DE POSTOS SETE ESTRELAS LTDA | 9489 |
| REDE LK DE POSTOS LTDA | 4957 |
| Other values (9613) | 1982449 |
Length
| Max length | 84 |
|---|---|
| Median length | 72 |
| Mean length | 30.67956528 |
| Min length | 8 |
Characters and Unicode
| Total characters | 62340938 |
|---|---|
| Distinct characters | 83 |
| Distinct categories | 12 |
| Distinct scripts | 3 |
| Distinct blocks | 4 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 147 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | AUTO POSTO SAKAMOTO LTDA |
|---|---|
| 2nd row | REDE GAZOLI AUTO POSTO LTDA. |
| 3rd row | AUTO POSTO CARREIRO LTDA |
| 4th row | AUTO POSTO PROGRESSO DE ADAMANTINA LTDA |
| 5th row | MARCIO A SPOSITO TRANSPORTES LTDA |
Common Values
| Value | Count | Frequency (%) |
| COMPANHIA BRASILEIRA DE DISTRIBUICAO | 12424 | 0.6% |
| SETE ESTRELAS COMERCIO DE DERIVADOS DE PETROLEO LTDA | 12410 | 0.6% |
| CARREFOUR COMERCIO E INDUSTRIA LTDA | 10273 | 0.5% |
| REDE DE POSTOS SETE ESTRELAS LTDA | 9489 | 0.5% |
| REDE LK DE POSTOS LTDA | 4957 | 0.2% |
| COMPETRO COMERCIO E DISTRIBUICAO DE DERIVADOS DE PETROLEO LTDA | 4450 | 0.2% |
| MAKRO ATACADISTA S.A | 3234 | 0.2% |
| FELIMAR AUTO POSTO LTDA. | 2672 | 0.1% |
| AUTO POSTO AVENIDA LTDA | 2493 | 0.1% |
| COOPERCITRUS COOPERATIVA DE PRODUTORES RURAIS | 2447 | 0.1% |
| Other values (9608) | 1967153 | 96.8% |
Length
[Figure: histogram of lengths of the category]
Words
| Value | Count | Frequency (%) |
| ltda | 1877750 | 18.0% |
| posto | 1514455 | 14.5% |
| auto | 1223404 | 11.7% |
| de | 625387 | 6.0% |
| (blank) | 192570 | 1.8% |
| e | 116411 | 1.1% |
| comercio | 109031 | 1.0% |
| combustiveis | 91982 | 0.9% |
| servicos | 90330 | 0.9% |
| centro | 77369 | 0.7% |
| Other values (6791) | 4519597 | 43.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 8424173 | 13.5% |
| O | 7433787 | 11.9% |
| A | 7089023 | 11.4% |
| T | 6378944 | 10.2% |
| S | 3601155 | 5.8% |
| E | 3512286 | 5.6% |
| D | 3369019 | 5.4% |
| L | 3079403 | 4.9% |
| I | 3026277 | 4.9% |
| R | 2671459 | 4.3% |
| Other values (73) | 13755412 | 22.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 52115068 | 83.6% |
| Space Separator | 8424173 | 13.5% |
| Other Punctuation | 546984 | 0.9% |
| Lowercase Letter | 324513 | 0.5% |
| Other Symbol | 271784 | 0.4% |
| Currency Symbol | 244993 | 0.4% |
| Decimal Number | 181202 | 0.3% |
| Dash Punctuation | 100600 | 0.2% |
| Other Letter | 92904 | 0.1% |
| Other Number | 35357 | 0.1% |
| Other values (2) | 3360 | < 0.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| O | 7433787 | 14.3% |
| A | 7089023 | 13.6% |
| T | 6378944 | 12.2% |
| S | 3601155 | 6.9% |
| E | 3512286 | 6.7% |
| D | 3369019 | 6.5% |
| L | 3079403 | 5.9% |
| I | 3026277 | 5.8% |
| R | 2671459 | 5.1% |
| P | 2317998 | 4.4% |
| Other values (24) | 9635717 | 18.5% |
Lowercase Letter
| Value | Count | Frequency (%) |
| õ | 246466 | 75.9% |
| ó | 44438 | 13.7% |
| è | 31720 | 9.8% |
| ò | 1867 | 0.6% |
| a | 4 | < 0.1% |
| r | 3 | < 0.1% |
| e | 3 | < 0.1% |
| i | 3 | < 0.1% |
| u | 2 | < 0.1% |
| f | 1 | < 0.1% |
| Other values (6) | 6 | < 0.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 46544 | 25.7% |
| 1 | 33460 | 18.5% |
| 2 | 27892 | 15.4% |
| 3 | 15900 | 8.8% |
| 5 | 15176 | 8.4% |
| 4 | 12621 | 7.0% |
| 6 | 9375 | 5.2% |
| 9 | 7091 | 3.9% |
| 7 | 7087 | 3.9% |
| 8 | 6056 | 3.3% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 421924 | 77.1% |
| & | 107663 | 19.7% |
| ' | 6740 | 1.2% |
| ¿ | 4645 | 0.8% |
| , | 3921 | 0.7% |
| " | 1712 | 0.3% |
| % | 181 | < 0.1% |
| / | 156 | < 0.1% |
| # | 42 | < 0.1% |
Other Symbol
| Value | Count | Frequency (%) |
| ├ | 242201 | 89.1% |
| ╝ | 15221 | 5.6% |
| ┤ | 11024 | 4.1% |
| ╡ | 2300 | 0.8% |
| ▓ | 968 | 0.4% |
| ┬ | 69 | < 0.1% |
| ║ | 1 | < 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 8424173 | 100.0% |
Currency Symbol
| Value | Count | Frequency (%) |
| £ | 244993 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 100600 | 100.0% |
Other Letter
| Value | Count | Frequency (%) |
| º | 92904 | 100.0% |
Other Number
| Value | Count | Frequency (%) |
| ¼ | 35357 | 100.0% |
Modifier Symbol
| Value | Count | Frequency (%) |
| ` | 1773 | 100.0% |
Math Symbol
| Value | Count | Frequency (%) |
| + | 1587 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 52284662 | 83.9% |
| Common | 9808453 | 15.7% |
| Greek | 247823 | 0.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| O | 7433787 | 14.2% |
| A | 7089023 | 13.6% |
| T | 6378944 | 12.2% |
| S | 3601155 | 6.9% |
| E | 3512286 | 6.7% |
| D | 3369019 | 6.4% |
| L | 3079403 | 5.9% |
| I | 3026277 | 5.8% |
| R | 2671459 | 5.1% |
| P | 2317998 | 4.4% |
| Other values (40) | 9805311 | 18.8% |
Common
| Value | Count | Frequency (%) |
| (space) | 8424173 | 85.9% |
| . | 421924 | 4.3% |
| £ | 244993 | 2.5% |
| ├ | 242201 | 2.5% |
| & | 107663 | 1.1% |
| - | 100600 | 1.0% |
| 0 | 46544 | 0.5% |
| ¼ | 35357 | 0.4% |
| 1 | 33460 | 0.3% |
| 2 | 27892 | 0.3% |
| Other values (22) | 123646 | 1.3% |
Greek
| Value | Count | Frequency (%) |
| Γ | 247823 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 61112929 | 98.0% |
| None | 956225 | 1.5% |
| Box Drawing | 270816 | 0.4% |
| Block Elements | 968 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 8424173 | 13.8% |
| O | 7433787 | 12.2% |
| A | 7089023 | 11.6% |
| T | 6378944 | 10.4% |
| S | 3601155 | 5.9% |
| E | 3512286 | 5.7% |
| D | 3369019 | 5.5% |
| L | 3079403 | 5.0% |
| I | 3026277 | 5.0% |
| R | 2671459 | 4.4% |
| Other values (50) | 12527403 | 20.5% |
None
| Value | Count | Frequency (%) |
| Γ | 247823 | 25.9% |
| õ | 246466 | 25.8% |
| £ | 244993 | 25.6% |
| º | 92904 | 9.7% |
| ó | 44438 | 4.6% |
| ¼ | 35357 | 3.7% |
| è | 31720 | 3.3% |
| ¿ | 4645 | 0.5% |
| Ò | 3264 | 0.3% |
| ò | 1867 | 0.2% |
| Other values (6) | 2748 | 0.3% |
Box Drawing
| Value | Count | Frequency (%) |
| ├ | 242201 | 89.4% |
| ╝ | 15221 | 5.6% |
| ┤ | 11024 | 4.1% |
| ╡ | 2300 | 0.8% |
| ┬ | 69 | < 0.1% |
| ║ | 1 | < 0.1% |
Block Elements
| Value | Count | Frequency (%) |
| ▓ | 968 | 100.0% |
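The Greek, Box Drawing and Currency Symbol characters in these tables (Γ, õ, £ and ├ with near-identical counts) are consistent with accented uppercase letters whose UTF-8 bytes were decoded with a DOS code page, twice over. The sketch below reproduces that effect and reverses it. The choice of `cp860` and the two rounds are inferences from the characters listed above, not facts stated by the report, and `garble` and `repair` are hypothetical helper names.

```python
def garble(text, rounds=2, codepage="cp860"):
    """Encode as UTF-8 and decode with the wrong code page, repeatedly."""
    for _ in range(rounds):
        text = text.encode("utf-8").decode(codepage)
    return text


def repair(text, rounds=2, codepage="cp860"):
    """Undo garble(); raises if the text does not round-trip cleanly."""
    for _ in range(rounds):
        text = text.encode(codepage).decode("utf-8")
    return text


print(garble("Ç"))  # Γõ£├º
print(repair("SERVIΓõ£├ºOS"))  # SERVIÇOS
```

If this assumption holds for the underlying file, re-reading the source with the correct encoding would be preferable to repairing values after the fact.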
CNPJ da Revenda
| Distinct | 9925 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 15.5 MiB |
| 58.444.084/0001-32 | 865 |
|---|---|
| 44.610.343/0001-43 | 860 |
| 66.706.243/0001-58 | 860 |
| 01.215.172/0001-45 | 857 |
| 48.192.819/0001-24 | 856 |
| Other values (9920) | 2027704 |
Length
| Max length | 19 |
|---|---|
| Median length | 19 |
| Mean length | 19 |
| Min length | 19 |
Characters and Unicode
| Total characters | 38608038 |
|---|---|
| Distinct characters | 14 |
| Distinct categories | 4 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 149 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | 49.051.667/0001-02 |
|---|---|
| 2nd row | 09.116.143/0001-38 |
| 3rd row | 55.451.876/0001-46 |
| 4th row | 52.605.052/0001-95 |
| 5th row | 54.187.588/0002-44 |
Common Values
| Value | Count | Frequency (%) |
| 58.444.084/0001-32 | 865 | < 0.1% |
| 44.610.343/0001-43 | 860 | < 0.1% |
| 66.706.243/0001-58 | 860 | < 0.1% |
| 01.215.172/0001-45 | 857 | < 0.1% |
| 48.192.819/0001-24 | 856 | < 0.1% |
| 03.687.679/0001-27 | 856 | < 0.1% |
| 55.845.366/0001-53 | 850 | < 0.1% |
| 65.755.688/0001-65 | 844 | < 0.1% |
| 03.419.315/0001-66 | 840 | < 0.1% |
| 59.532.275/0001-19 | 839 | < 0.1% |
| Other values (9915) | 2023475 | 99.6% |
Length
[Figure: histogram of lengths of the category]
Words
| Value | Count | Frequency (%) |
| 58.444.084/0001-32 | 865 | < 0.1% |
| 66.706.243/0001-58 | 860 | < 0.1% |
| 44.610.343/0001-43 | 860 | < 0.1% |
| 01.215.172/0001-45 | 857 | < 0.1% |
| 48.192.819/0001-24 | 856 | < 0.1% |
| 03.687.679/0001-27 | 856 | < 0.1% |
| 55.845.366/0001-53 | 850 | < 0.1% |
| 65.755.688/0001-65 | 844 | < 0.1% |
| 03.419.315/0001-66 | 840 | < 0.1% |
| 59.532.275/0001-19 | 839 | < 0.1% |
| Other values (9915) | 2023475 | 99.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 9135989 | 23.7% |
| . | 4064004 | 10.5% |
| 1 | 3828434 | 9.9% |
| 4 | 2221164 | 5.8% |
| 5 | 2172231 | 5.6% |
| (space) | 2032002 | 5.3% |
| / | 2032002 | 5.3% |
| - | 2032002 | 5.3% |
| 6 | 1961845 | 5.1% |
| 2 | 1915645 | 5.0% |
| Other values (4) | 7212720 | 18.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 28448028 | 73.7% |
| Other Punctuation | 6096006 | 15.8% |
| Space Separator | 2032002 | 5.3% |
| Dash Punctuation | 2032002 | 5.3% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 9135989 | 32.1% |
| 1 | 3828434 | 13.5% |
| 4 | 2221164 | 7.8% |
| 5 | 2172231 | 7.6% |
| 6 | 1961845 | 6.9% |
| 2 | 1915645 | 6.7% |
| 3 | 1909265 | 6.7% |
| 9 | 1774651 | 6.2% |
| 7 | 1772043 | 6.2% |
| 8 | 1756761 | 6.2% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 4064004 | 66.7% |
| / | 2032002 | 33.3% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 2032002 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 2032002 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 38608038 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 9135989 | 23.7% |
| . | 4064004 | 10.5% |
| 1 | 3828434 | 9.9% |
| 4 | 2221164 | 5.8% |
| 5 | 2172231 | 5.6% |
| (space) | 2032002 | 5.3% |
| / | 2032002 | 5.3% |
| - | 2032002 | 5.3% |
| 6 | 1961845 | 5.1% |
| 2 | 1915645 | 5.0% |
| Other values (4) | 7212720 | 18.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 38608038 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 9135989 | 23.7% |
| . | 4064004 | 10.5% |
| 1 | 3828434 | 9.9% |
| 4 | 2221164 | 5.8% |
| 5 | 2172231 | 5.6% |
| (space) | 2032002 | 5.3% |
| / | 2032002 | 5.3% |
| - | 2032002 | 5.3% |
| 6 | 1961845 | 5.1% |
| 2 | 1915645 | 5.0% |
| Other values (4) | 7212720 | 18.7% |
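Every value of this variable is 19 characters long and the character counts show exactly one space per row (2,032,002), while the CNPJs as printed have 18 visible characters. A minimal sketch for normalizing such values before grouping or joining on them is shown below. The position of the space is an assumption (the report does not show it), and `normalize_cnpj` is a hypothetical helper.

```python
import re

# Layout of the values shown in the tables above: NN.NNN.NNN/NNNN-NN
CNPJ_PATTERN = re.compile(r"\d{2}\.\d{3}\.\d{3}/\d{4}-\d{2}")


def normalize_cnpj(raw):
    """Strip surrounding whitespace; return None if the layout is off."""
    value = raw.strip()
    return value if CNPJ_PATTERN.fullmatch(value) else None


# Hypothetical raw value: the report does not show where the space sits.
raw = " 58.444.084/0001-32"
print(len(raw), repr(normalize_cnpj(raw)))
```

Because `strip()` removes both leading and trailing whitespace, the sketch does not depend on which side the extra character is on. It checks the layout only, not the CNPJ check digits.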
Nome da Rua
| Distinct | 5284 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 15.5 MiB |
| AVENIDA BRASIL | 18323 |
|---|---|
| RODOVIA PRESIDENTE DUTRA | 10852 |
| AVENIDA PRESIDENTE KENNEDY | 10359 |
| AVENIDA PRESIDENTE VARGAS | 10164 |
| AVENIDA INDEPENDENCIA | 8849 |
| Other values (5279) | 1973455 |
Length
| Max length | 62 |
|---|---|
| Median length | 50 |
| Mean length | 22.74089986 |
| Min length | 5 |
Characters and Unicode
| Total characters | 46209554 |
|---|---|
| Distinct characters | 79 |
| Distinct categories | 14 |
| Distinct scripts | 3 |
| Distinct blocks | 4 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 68 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | RODOVIA PRESIDENTE DUTRA |
|---|---|
| 2nd row | AVENIDA MARECHAL CASTELO BRANCO |
| 3rd row | AVENIDA CAP JOSE A DE OLIVEIRA |
| 4th row | AVENIDA RIO BRANCO |
| 5th row | AVENIDA RIO BRANCO |
Common Values
| Value | Count | Frequency (%) |
| AVENIDA BRASIL | 18323 | 0.9% |
| RODOVIA PRESIDENTE DUTRA | 10852 | 0.5% |
| AVENIDA PRESIDENTE KENNEDY | 10359 | 0.5% |
| AVENIDA PRESIDENTE VARGAS | 10164 | 0.5% |
| AVENIDA INDEPENDENCIA | 8849 | 0.4% |
| AVENIDA WASHINGTON LUIZ | 8159 | 0.4% |
| RUA FLORIANO PEIXOTO | 7471 | 0.4% |
| AVENIDA RIO BRANCO | 7151 | 0.4% |
| AVENIDA TIRADENTES | 6928 | 0.3% |
| AVENIDA SANTOS DUMONT | 6264 | 0.3% |
| Other values (5274) | 1937482 | 95.3% |
Length
[Figure: histogram of lengths of the category]
Words
| Value | Count | Frequency (%) |
| avenida | 1074178 | 15.2% |
| rua | 728623 | 10.3% |
| de | 295010 | 4.2% |
| rodovia | 106318 | 1.5% |
| da | 77338 | 1.1% |
| jose | 75230 | 1.1% |
| antonio | 59827 | 0.8% |
| joao | 56924 | 0.8% |
| presidente | 55059 | 0.8% |
| do | 54360 | 0.8% |
| Other values (3877) | 4482766 | 63.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| A | 7090114 | 15.3% |
| (space) | 5084452 | 11.0% |
| E | 4090637 | 8.9% |
| I | 3601905 | 7.8% |
| O | 3416705 | 7.4% |
| R | 3414528 | 7.4% |
| N | 2879955 | 6.2% |
| D | 2764942 | 6.0% |
| S | 1889396 | 4.1% |
| U | 1689451 | 3.7% |
| Other values (69) | 10287469 | 22.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 40335004 | 87.3% |
| Space Separator | 5084452 | 11.0% |
| Decimal Number | 208186 | 0.5% |
| Lowercase Letter | 184260 | 0.4% |
| Other Symbol | 133066 | 0.3% |
| Currency Symbol | 117281 | 0.3% |
| Other Punctuation | 97276 | 0.2% |
| Other Letter | 20432 | < 0.1% |
| Dash Punctuation | 12791 | < 0.1% |
| Close Punctuation | 5037 | < 0.1% |
| Other values (4) | 11769 | < 0.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 7090114 | 17.6% |
| E | 4090637 | 10.1% |
| I | 3601905 | 8.9% |
| O | 3416705 | 8.5% |
| R | 3414528 | 8.5% |
| N | 2879955 | 7.1% |
| D | 2764942 | 6.9% |
| S | 1889396 | 4.7% |
| U | 1689451 | 4.2% |
| V | 1595497 | 4.0% |
| Other values (24) | 7901874 | 19.6% |
Lowercase Letter
| Value | Count | Frequency (%) |
| õ | 116697 | 63.3% |
| ó | 51611 | 28.0% |
| è | 14105 | 7.7% |
| ò | 1508 | 0.8% |
| ç | 326 | 0.2% |
| a | 3 | < 0.1% |
| n | 2 | < 0.1% |
| o | 2 | < 0.1% |
| u | 1 | < 0.1% |
| e | 1 | < 0.1% |
| Other values (4) | 4 | < 0.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 41552 | 20.0% |
| 3 | 32964 | 15.8% |
| 2 | 31636 | 15.2% |
| 0 | 21433 | 10.3% |
| 4 | 21264 | 10.2% |
| 9 | 14823 | 7.1% |
| 7 | 14567 | 7.0% |
| 5 | 13859 | 6.7% |
| 6 | 11182 | 5.4% |
| 8 | 4906 | 2.4% |
Other Symbol
| Value | Count | Frequency (%) |
| ├ | 113912 | 85.6% |
| ╝ | 11578 | 8.7% |
| ┤ | 3847 | 2.9% |
| ╡ | 2256 | 1.7% |
| ▓ | 1369 | 1.0% |
| ┬ | 104 | 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 74514 | 76.6% |
| , | 12866 | 13.2% |
| / | 3827 | 3.9% |
| ' | 3001 | 3.1% |
| ¿ | 2528 | 2.6% |
| : | 540 | 0.6% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 5084452 | 100.0% |
Currency Symbol
| Value | Count | Frequency (%) |
| £ | 117281 | 100.0% |
Other Letter
| Value | Count | Frequency (%) |
| º | 20432 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 12791 | 100.0% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 5037 | 100.0% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 5037 | 100.0% |
Other Number
| Value | Count | Frequency (%) |
| ¼ | 3974 | 100.0% |
Math Symbol
| Value | Count | Frequency (%) |
| + | 1632 | 100.0% |
Modifier Symbol
| Value | Count | Frequency (%) |
| ` | 1126 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 40421561 | 87.5% |
| Common | 5669858 | 12.3% |
| Greek | 118135 | 0.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| A | 7090114 | 17.5% |
| E | 4090637 | 10.1% |
| I | 3601905 | 8.9% |
| O | 3416705 | 8.5% |
| R | 3414528 | 8.4% |
| N | 2879955 | 7.1% |
| D | 2764942 | 6.8% |
| S | 1889396 | 4.7% |
| U | 1689451 | 4.2% |
| V | 1595497 | 3.9% |
| Other values (38) | 7988431 | 19.8% |
Common
| Value | Count | Frequency (%) |
| (space) | 5084452 | 89.7% |
| £ | 117281 | 2.1% |
| ├ | 113912 | 2.0% |
| . | 74514 | 1.3% |
| 1 | 41552 | 0.7% |
| 3 | 32964 | 0.6% |
| 2 | 31636 | 0.6% |
| 0 | 21433 | 0.4% |
| 4 | 21264 | 0.4% |
| 9 | 14823 | 0.3% |
| Other values (20) | 116027 | 2.0% |
Greek
| Value | Count | Frequency (%) |
| Γ | 118135 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 45627117 | 98.7% |
| None | 449371 | 1.0% |
| Box Drawing | 131697 | 0.3% |
| Block Elements | 1369 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| A | 7090114 | 15.5% |
| (space) | 5084452 | 11.1% |
| E | 4090637 | 9.0% |
| I | 3601905 | 7.9% |
| O | 3416705 | 7.5% |
| R | 3414528 | 7.5% |
| N | 2879955 | 6.3% |
| D | 2764942 | 6.1% |
| S | 1889396 | 4.1% |
| U | 1689451 | 3.7% |
| Other values (46) | 9705032 | 21.3% |
None
| Value | Count | Frequency (%) |
| Γ | 118135 | 26.3% |
| £ | 117281 | 26.1% |
| õ | 116697 | 26.0% |
| ó | 51611 | 11.5% |
| º | 20432 | 4.5% |
| è | 14105 | 3.1% |
| ¼ | 3974 | 0.9% |
| ¿ | 2528 | 0.6% |
| ò | 1508 | 0.3% |
| À | 1127 | 0.3% |
| Other values (7) | 1973 | 0.4% |
Box Drawing
| Value | Count | Frequency (%) |
| ├ | 113912 | 86.5% |
| ╝ | 11578 | 8.8% |
| ┤ | 3847 | 2.9% |
| ╡ | 2256 | 1.7% |
| ┬ | 104 | 0.1% |
Block Elements
| Value | Count | Frequency (%) |
| ▓ | 1369 | 100.0% |
Numero Rua
| Distinct | 3137 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 152 |
| Missing (%) | < 0.1% |
| Memory size | 15.5 MiB |
| S/N | 108763 |
|---|---|
| 15 | 11559 |
| 30 | 9215 |
| 300 | 8626 |
| 10 | 8555 |
| Other values (3132) | 1885132 |
Length
| Max length | 15 |
|---|---|
| Median length | 11 |
| Mean length | 3.311819278 |
| Min length | 1 |
Characters and Unicode
| Total characters | 6729120 |
|---|---|
| Distinct characters | 42 |
| Distinct categories | 9 |
| Distinct scripts | 3 |
| Distinct blocks | 3 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 24 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | S/N |
|---|---|
| 2nd row | 15 |
| 3rd row | 160 |
| 4th row | 764 |
| 5th row | 1625 |
Common Values
| Value | Count | Frequency (%) |
| S/N | 108763 | 5.4% |
| 15 | 11559 | 0.6% |
| 30 | 9215 | 0.5% |
| 300 | 8626 | 0.4% |
| 10 | 8555 | 0.4% |
| SN | 8326 | 0.4% |
| 25 | 7861 | 0.4% |
| 600 | 7671 | 0.4% |
| 20 | 7267 | 0.4% |
| 21 | 7210 | 0.4% |
| Other values (3127) | 1846797 | 90.9% |
Length
[Figure: histogram of lengths of the category]
Words
| Value | Count | Frequency (%) |
| s/n | 109969 | 5.4% |
| 15 | 11559 | 0.6% |
| 30 | 9215 | 0.5% |
| 300 | 8626 | 0.4% |
| 10 | 8555 | 0.4% |
| sn | 8326 | 0.4% |
| 25 | 8179 | 0.4% |
| 600 | 7687 | 0.4% |
| 21 | 7549 | 0.4% |
| 20 | 7267 | 0.4% |
| Other values (3101) | 1853428 | 90.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 1094054 | 16.3% |
| 0 | 855712 | 12.7% |
| 2 | 717234 | 10.7% |
| 5 | 672921 | 10.0% |
| 3 | 641447 | 9.5% |
| 4 | 558498 | 8.3% |
| 6 | 482382 | 7.2% |
| 7 | 451592 | 6.7% |
| 9 | 415748 | 6.2% |
| 8 | 414964 | 6.2% |
| Other values (32) | 424568 | 6.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 6304552 | 93.7% |
| Uppercase Letter | 248589 | 3.7% |
| Other Punctuation | 149976 | 2.2% |
| Dash Punctuation | 16489 | 0.2% |
| Space Separator | 8878 | 0.1% |
| Lowercase Letter | 556 | < 0.1% |
| Other Number | 76 | < 0.1% |
| Currency Symbol | 2 | < 0.1% |
| Other Symbol | 2 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| m | 147 | 26.4% |
| õ | 78 | 14.0% |
| ò | 78 | 14.0% |
| a | 67 | 12.1% |
| i | 55 | 9.9% |
| j | 32 | 5.8% |
| u | 32 | 5.8% |
| l | 32 | 5.8% |
| g | 12 | 2.2% |
| o | 12 | 2.2% |
| Other values (4) | 11 | 2.0% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 1094054 | 17.4% |
| 0 | 855712 | 13.6% |
| 2 | 717234 | 11.4% |
| 5 | 672921 | 10.7% |
| 3 | 641447 | 10.2% |
| 4 | 558498 | 8.9% |
| 6 | 482382 | 7.7% |
| 7 | 451592 | 7.2% |
| 9 | 415748 | 6.6% |
| 8 | 414964 | 6.6% |
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 121479 | 48.9% |
| N | 120990 | 48.7% |
| K | 2136 | 0.9% |
| M | 2044 | 0.8% |
| B | 790 | 0.3% |
| R | 789 | 0.3% |
| Γ | 156 | 0.1% |
| A | 88 | < 0.1% |
| À | 78 | < 0.1% |
| L | 39 | < 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 116676 | 77.8% |
| , | 31678 | 21.1% |
| . | 1622 | 1.1% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 16489 | 100.0% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 8878 | 100.0% |
Other Number
| Value | Count | Frequency (%) |
| ¼ | 76 | 100.0% |
Currency Symbol
| Value | Count | Frequency (%) |
| £ | 2 | 100.0% |
Other Symbol
| Value | Count | Frequency (%) |
| ├ | 2 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 6479975 | 96.3% |
| Latin | 248989 | 3.7% |
| Greek | 156 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| S | 121479 | 48.8% |
| N | 120990 | 48.6% |
| K | 2136 | 0.9% |
| M | 2044 | 0.8% |
| B | 790 | 0.3% |
| R | 789 | 0.3% |
| m | 147 | 0.1% |
| A | 88 | < 0.1% |
| õ | 78 | < 0.1% |
| ò | 78 | < 0.1% |
| Other values (13) | 370 | 0.1% |
Common
| Value | Count | Frequency (%) |
| 1 | 1094054 | 16.9% |
| 0 | 855712 | 13.2% |
| 2 | 717234 | 11.1% |
| 5 | 672921 | 10.4% |
| 3 | 641447 | 9.9% |
| 4 | 558498 | 8.6% |
| 6 | 482382 | 7.4% |
| 7 | 451592 | 7.0% |
| 9 | 415748 | 6.4% |
| 8 | 414964 | 6.4% |
| Other values (8) | 175423 | 2.7% |
Greek
| Value | Count | Frequency (%) |
| Γ | 156 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 6728648 | |
| None | 470 | < 0.1% |
| Box Drawing | 2 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 1094054 | |
| 0 | 855712 | |
| 2 | 717234 | |
| 5 | 672921 | |
| 3 | 641447 | |
| 4 | 558498 | |
| 6 | 482382 | |
| 7 | 451592 | |
| 9 | 415748 | 6.2% |
| 8 | 414964 | 6.2% |
| Other values (24) | 424096 | 6.3% |
None
| Value | Count | Frequency (%) |
| Γ | 156 | |
| õ | 78 | |
| ò | 78 | |
| À | 78 | |
| ¼ | 76 | |
| £ | 2 | 0.4% |
| ç | 2 | 0.4% |
Box Drawing
| Value | Count | Frequency (%) |
| ├ | 2 | 100.0% |
Complemento
| Distinct | 747 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 1741308 |
| Missing (%) | 85.7% |
| Memory size | 15.5 MiB |
| 0 | 58739 |
|---|---|
| TERREO | 6229 |
| SETOR I | 4917 |
| A | 4854 |
| Other values (742) | 178218 |
Length
| Max length | 113 |
|---|---|
| Median length | 108 |
| Mean length | 7.164774643 |
| Min length | 1 |
Characters and Unicode
| Total characters | 2082757 |
|---|---|
| Distinct characters | 59 |
| Distinct categories | 12 |
| Distinct scripts | 3 |
| Distinct blocks | 4 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 18 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | KM 210,5-SENT SP/RJ |
|---|---|
| 2nd row | ESQ.R.RACHID KASSOUF |
| 3rd row | KM. 47,5 |
| 4th row | KM 50 |
| 5th row | KM 40 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 58739 | 2.9% |
| (blank) | 37737 | 1.9% |
| TERREO | 6229 | 0.3% |
| SETOR I | 4917 | 0.2% |
| A | 4854 | 0.2% |
| PARTE | 3438 | 0.2% |
| POSTO | 2569 | 0.1% |
| - | 2008 | 0.1% |
| B | 1531 | 0.1% |
| . | 1280 | 0.1% |
| Other values (737) | 167392 | 8.2% |
| (Missing) | 1741308 | 85.7% |
Length
Histogram of lengths of the category
| Value | Count | Frequency (%) |
| km | 92593 | 17.4% |
| 0 | 59518 | 11.2% |
| (blank) | 17502 | 3.3% |
| a | 12154 | 2.3% |
| lote | 8435 | 1.6% |
| parte | 7183 | 1.3% |
| terreo | 6910 | 1.3% |
| quadra | 6072 | 1.1% |
| i | 5905 | 1.1% |
| setor | 5848 | 1.1% |
| Other values (841) | 311148 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 428384 | 20.6% |
| M | 118799 | 5.7% |
| 0 | 118574 | 5.7% |
| A | 114237 | 5.5% |
| K | 94595 | 4.5% |
| E | 91334 | 4.4% |
| O | 84446 | 4.1% |
| R | 77242 | 3.7% |
| 1 | 75786 | 3.6% |
| T | 68874 | 3.3% |
| Other values (49) | 810486 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 1024370 | |
| Decimal Number | 528253 | |
| Space Separator | 428384 | |
| Other Punctuation | 67320 | 3.2% |
| Dash Punctuation | 13563 | 0.7% |
| Math Symbol | 11943 | 0.6% |
| Lowercase Letter | 4272 | 0.2% |
| Other Symbol | 1760 | 0.1% |
| Currency Symbol | 1423 | 0.1% |
| Other Number | 1090 | 0.1% |
| Other values (2) | 379 | < 0.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| M | 118799 | |
| A | 114237 | |
| K | 94595 | 9.2% |
| E | 91334 | 8.9% |
| O | 84446 | 8.2% |
| R | 77242 | 7.5% |
| T | 68874 | 6.7% |
| S | 52351 | 5.1% |
| I | 44231 | 4.3% |
| N | 35734 | 3.5% |
| Other values (19) | 242527 |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 118574 | |
| 1 | 75786 | |
| 2 | 66703 | |
| 5 | 50478 | |
| 3 | 46582 | 8.8% |
| 4 | 44639 | 8.5% |
| 6 | 35071 | 6.6% |
| 8 | 34589 | 6.5% |
| 7 | 32349 | 6.1% |
| 9 | 23482 | 4.4% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 23469 | |
| , | 22620 | |
| / | 18243 | |
| ; | 1566 | 2.3% |
| : | 1388 | 2.1% |
| ' | 34 | 0.1% |
Lowercase Letter
| Value | Count | Frequency (%) |
| õ | 2524 | |
| ò | 950 | 22.2% |
| ó | 655 | 15.3% |
| è | 143 | 3.3% |
Other Symbol
| Value | Count | Frequency (%) |
| ├ | 1398 | |
| ▓ | 198 | 11.2% |
| ╝ | 164 | 9.3% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 428384 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 13563 | 100.0% |
Math Symbol
| Value | Count | Frequency (%) |
| + | 11943 | 100.0% |
Currency Symbol
| Value | Count | Frequency (%) |
| £ | 1423 | 100.0% |
Other Number
| Value | Count | Frequency (%) |
| ¼ | 1090 | 100.0% |
Other Letter
| Value | Count | Frequency (%) |
| º | 202 | 100.0% |
Modifier Symbol
| Value | Count | Frequency (%) |
| ` | 177 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 1053913 | |
| Latin | 1025214 | |
| Greek | 3630 | 0.2% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| M | 118799 | |
| A | 114237 | |
| K | 94595 | 9.2% |
| E | 91334 | 8.9% |
| O | 84446 | 8.2% |
| R | 77242 | 7.5% |
| T | 68874 | 6.7% |
| S | 52351 | 5.1% |
| I | 44231 | 4.3% |
| N | 35734 | 3.5% |
| Other values (23) | 243371 |
Common
| Value | Count | Frequency (%) |
| (space) | 428384 | 40.6% |
| 0 | 118574 | 11.3% |
| 1 | 75786 | 7.2% |
| 2 | 66703 | 6.3% |
| 5 | 50478 | 4.8% |
| 3 | 46582 | 4.4% |
| 4 | 44639 | 4.2% |
| 6 | 35071 | 3.3% |
| 8 | 34589 | 3.3% |
| 7 | 32349 | 3.1% |
| Other values (15) | 120758 | 11.5% |
Greek
| Value | Count | Frequency (%) |
| Γ | 3630 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2069071 | |
| None | 11926 | 0.6% |
| Box Drawing | 1562 | 0.1% |
| Block Elements | 198 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 428384 | 20.7% |
| M | 118799 | 5.7% |
| 0 | 118574 | 5.7% |
| A | 114237 | 5.5% |
| K | 94595 | 4.6% |
| E | 91334 | 4.4% |
| O | 84446 | 4.1% |
| R | 77242 | 3.7% |
| 1 | 75786 | 3.7% |
| T | 68874 | 3.3% |
| Other values (35) | 796800 |
None
| Value | Count | Frequency (%) |
| Γ | 3630 | |
| õ | 2524 | |
| £ | 1423 | 11.9% |
| À | 1106 | 9.3% |
| ¼ | 1090 | 9.1% |
| ò | 950 | 8.0% |
| ó | 655 | 5.5% |
| º | 202 | 1.7% |
| Ú | 167 | 1.4% |
| è | 143 | 1.2% |
Box Drawing
| Value | Count | Frequency (%) |
| ├ | 1398 | |
| ╝ | 164 | 10.5% |
Block Elements
| Value | Count | Frequency (%) |
| ▓ | 198 | 100.0% |
Bairro
| Distinct | 3652 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 5502 |
| Missing (%) | 0.3% |
| Memory size | 15.5 MiB |
| CENTRO | 373609 |
|---|---|
| ZONA RURAL | 13513 |
| JARDIM PAULISTA | 11156 |
| IPIRANGA | 9678 |
| BELA VISTA | 9426 |
| Other values (3647) | 1609118 |
Length
| Max length | 48 |
|---|---|
| Median length | 40 |
| Mean length | 11.96545769 |
| Min length | 1 |
Characters and Unicode
| Total characters | 24248000 |
|---|---|
| Distinct characters | 71 |
| Distinct categories | 12 |
| Distinct scripts | 3 |
| Distinct blocks | 4 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 45 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | BONSUCESSO |
|---|---|
| 2nd row | VILA JAMIL DE LIMA |
| 3rd row | CENTRO |
| 4th row | CENTRO |
| 5th row | VILA INDUSTRIAL |
Common Values
| Value | Count | Frequency (%) |
| CENTRO | 373609 | 18.4% |
| ZONA RURAL | 13513 | 0.7% |
| JARDIM PAULISTA | 11156 | 0.5% |
| IPIRANGA | 9678 | 0.5% |
| BELA VISTA | 9426 | 0.5% |
| VILA NOVA | 9423 | 0.5% |
| SANTANA | 8972 | 0.4% |
| JARDIM BELA VISTA | 7266 | 0.4% |
| SANTO AMARO | 6893 | 0.3% |
| JARDIM AMERICA | 6354 | 0.3% |
| Other values (3642) | 1570210 | 77.3% |
| (Missing) | 5502 | 0.3% |
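The Frequency (%) column in the Common Values tables is each count divided by the number of observations (2032002, from the dataset statistics). A minimal sketch, assuming one-decimal rounding as the displayed values suggest; `frequency_pct` is a hypothetical helper for illustration, not part of the profiling tool:

```python
N_OBSERVATIONS = 2_032_002  # "Number of observations" from the dataset statistics


def frequency_pct(count: int, total: int = N_OBSERVATIONS) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


centro = frequency_pct(373_609)  # rows whose Bairro is "CENTRO"
missing = frequency_pct(5_502)   # rows whose Bairro is missing
print(centro, missing)
```

Recomputing a few rows this way is a quick check that a table was extracted intact.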
Length
Histogram of lengths of the category
| Value | Count | Frequency (%) |
| centro | 375653 | 9.9% |
| vila | 354119 | 9.3% |
| jardim | 326468 | 8.6% |
| jd | 81388 | 2.1% |
| santa | 72387 | 1.9% |
| parque | 64237 | 1.7% |
| sao | 58903 | 1.6% |
| nova | 56251 | 1.5% |
| do | 40504 | 1.1% |
| vista | 30400 | 0.8% |
| Other values (2395) | 2337226 |
Most occurring characters
| Value | Count | Frequency (%) |
| A | 3561254 | |
| I | 2019708 | 8.3% |
| R | 1966111 | 8.1% |
| (space) | 1776919 | 7.3% |
| O | 1749819 | 7.2% |
| E | 1607983 | 6.6% |
| N | 1340560 | 5.5% |
| T | 1092667 | 4.5% |
| L | 1036693 | 4.3% |
| D | 997375 | 4.1% |
| Other values (61) | 7098911 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 21674276 | |
| Space Separator | 1776919 | 7.3% |
| Lowercase Letter | 259840 | 1.1% |
| Other Symbol | 213993 | 0.9% |
| Currency Symbol | 179816 | 0.7% |
| Other Punctuation | 78846 | 0.3% |
| Other Letter | 24734 | 0.1% |
| Other Number | 18622 | 0.1% |
| Decimal Number | 9028 | < 0.1% |
| Dash Punctuation | 4030 | < 0.1% |
| Other values (2) | 7896 | < 0.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 3561254 | |
| I | 2019708 | 9.3% |
| R | 1966111 | 9.1% |
| O | 1749819 | 8.1% |
| E | 1607983 | 7.4% |
| N | 1340560 | 6.2% |
| T | 1092667 | 5.0% |
| L | 1036693 | 4.8% |
| D | 997375 | 4.6% |
| C | 973006 | 4.5% |
| Other values (25) | 5329100 |
Lowercase Letter
| Value | Count | Frequency (%) |
| õ | 176716 | |
| ó | 56338 | 21.7% |
| è | 26043 | 10.0% |
| ò | 737 | 0.3% |
| r | 1 | < 0.1% |
| a | 1 | < 0.1% |
| c | 1 | < 0.1% |
| e | 1 | < 0.1% |
| l | 1 | < 0.1% |
| i | 1 | < 0.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 2875 | |
| 1 | 1733 | |
| 5 | 1301 | |
| 2 | 1157 | |
| 3 | 792 | 8.8% |
| 7 | 414 | 4.6% |
| 8 | 399 | 4.4% |
| 4 | 357 | 4.0% |
Other Symbol
| Value | Count | Frequency (%) |
| ├ | 175090 | |
| ╝ | 16932 | 7.9% |
| ┤ | 15950 | 7.5% |
| ▓ | 3808 | 1.8% |
| ╡ | 2030 | 0.9% |
| ┬ | 183 | 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 71300 | |
| ¿ | 4116 | 5.2% |
| / | 1680 | 2.1% |
| ' | 1373 | 1.7% |
| : | 377 | 0.5% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 1776919 | 100.0% |
Currency Symbol
| Value | Count | Frequency (%) |
| £ | 179816 | 100.0% |
Other Letter
| Value | Count | Frequency (%) |
| º | 24734 | 100.0% |
Other Number
| Value | Count | Frequency (%) |
| ¼ | 18622 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 4030 | 100.0% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 3948 | 100.0% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 3948 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 21781592 | |
| Common | 2289150 | 9.4% |
| Greek | 177258 | 0.7% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| A | 3561254 | |
| I | 2019708 | 9.3% |
| R | 1966111 | 9.0% |
| O | 1749819 | 8.0% |
| E | 1607983 | 7.4% |
| N | 1340560 | 6.2% |
| T | 1092667 | 5.0% |
| L | 1036693 | 4.8% |
| D | 997375 | 4.6% |
| C | 973006 | 4.5% |
| Other values (35) | 5436416 |
Common
| Value | Count | Frequency (%) |
| (space) | 1776919 | 77.6% |
| £ | 179816 | 7.9% |
| ├ | 175090 | 7.6% |
| . | 71300 | 3.1% |
| ¼ | 18622 | 0.8% |
| ╝ | 16932 | 0.7% |
| ┤ | 15950 | 0.7% |
| ¿ | 4116 | 0.2% |
| - | 4030 | 0.2% |
| ) | 3948 | 0.2% |
| Other values (15) | 22427 | 1.0% |
Greek
| Value | Count | Frequency (%) |
| Γ | 177258 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 23365680 | |
| None | 668327 | 2.8% |
| Box Drawing | 210185 | 0.9% |
| Block Elements | 3808 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| A | 3561254 | |
| I | 2019708 | 8.6% |
| R | 1966111 | 8.4% |
| (space) | 1776919 | 7.6% |
| O | 1749819 | 7.5% |
| E | 1607983 | 6.9% |
| N | 1340560 | 5.7% |
| T | 1092667 | 4.7% |
| L | 1036693 | 4.4% |
| D | 997375 | 4.3% |
| Other values (38) | 6216591 |
None
| Value | Count | Frequency (%) |
| £ | 179816 | |
| Γ | 177258 | |
| õ | 176716 | |
| ó | 56338 | 8.4% |
| è | 26043 | 3.9% |
| º | 24734 | 3.7% |
| ¼ | 18622 | 2.8% |
| ¿ | 4116 | 0.6% |
| Ò | 3074 | 0.5% |
| ò | 737 | 0.1% |
| Other values (7) | 873 | 0.1% |
Box Drawing
| Value | Count | Frequency (%) |
| ├ | 175090 | |
| ╝ | 16932 | 8.1% |
| ┤ | 15950 | 7.6% |
| ╡ | 2030 | 1.0% |
| ┬ | 183 | 0.1% |
Block Elements
| Value | Count | Frequency (%) |
| ▓ | 3808 | 100.0% |
Cep
| Distinct | 5609 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 15.5 MiB |
| 11680-000 | 11181 |
|---|---|
| 13140-000 | 8949 |
| 18900-000 | 8941 |
| 17400-000 | 8459 |
| 17900-000 | 8457 |
| Other values (5604) | 1986015 |
Length
| Max length | 9 |
|---|---|
| Median length | 9 |
| Mean length | 9 |
| Min length | 9 |
Characters and Unicode
| Total characters | 18288018 |
|---|---|
| Distinct characters | 11 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 50 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | 07178-580 |
|---|---|
| 2nd row | 17800-000 |
| 3rd row | 17800-000 |
| 4th row | 17800-000 |
| 5th row | 17800-000 |
Common Values
| Value | Count | Frequency (%) |
| 11680-000 | 11181 | 0.6% |
| 13140-000 | 8949 | 0.4% |
| 18900-000 | 8941 | 0.4% |
| 17400-000 | 8459 | 0.4% |
| 17900-000 | 8457 | 0.4% |
| 19400-000 | 8403 | 0.4% |
| 15910-000 | 8306 | 0.4% |
| 15200-000 | 8286 | 0.4% |
| 12940-000 | 8107 | 0.4% |
| 14900-000 | 8091 | 0.4% |
| Other values (5599) | 1944822 | 95.7% |
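Every value in this column has length 9, and the statistics show only digits and a hyphen, which fits the CEP postal-code layout of five digits, a hyphen, and three digits. As a sketch, the layout can be checked with a regular expression; the pattern and the `is_valid_cep` helper are assumptions for illustration, not a check the report itself performs:

```python
import re

# Five ASCII digits, a hyphen, three ASCII digits (the layout seen in the samples).
CEP_PATTERN = re.compile(r"[0-9]{5}-[0-9]{3}")


def is_valid_cep(value: str) -> bool:
    """Return True when the whole string matches the NNNNN-NNN layout."""
    return CEP_PATTERN.fullmatch(value) is not None


samples = ["07178-580", "17800-000", "17800000", "1780-0000"]
valid = [s for s in samples if is_valid_cep(s)]
print(valid)
```

Keeping the column as text preserves leading zeros such as the one in `07178-580`.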
Length
Histogram of lengths of the category
| Value | Count | Frequency (%) |
| 11680-000 | 11181 | 0.6% |
| 13140-000 | 8949 | 0.4% |
| 18900-000 | 8941 | 0.4% |
| 17400-000 | 8459 | 0.4% |
| 17900-000 | 8457 | 0.4% |
| 19400-000 | 8403 | 0.4% |
| 15910-000 | 8306 | 0.4% |
| 15200-000 | 8286 | 0.4% |
| 12940-000 | 8107 | 0.4% |
| 14900-000 | 8091 | 0.4% |
| Other values (5599) | 1944822 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 6232824 | |
| 1 | 2705551 | |
| - | 2032002 | 11.1% |
| 3 | 1321571 | 7.2% |
| 2 | 1122619 | 6.1% |
| 5 | 1017867 | 5.6% |
| 4 | 1009631 | 5.5% |
| 7 | 785840 | 4.3% |
| 6 | 772820 | 4.2% |
| 8 | 710202 | 3.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 16256016 | |
| Dash Punctuation | 2032002 | 11.1% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 6232824 | |
| 1 | 2705551 | |
| 3 | 1321571 | 8.1% |
| 2 | 1122619 | 6.9% |
| 5 | 1017867 | 6.3% |
| 4 | 1009631 | 6.2% |
| 7 | 785840 | 4.8% |
| 6 | 772820 | 4.8% |
| 8 | 710202 | 4.4% |
| 9 | 577091 | 3.6% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 2032002 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 18288018 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 6232824 | |
| 1 | 2705551 | |
| - | 2032002 | 11.1% |
| 3 | 1321571 | 7.2% |
| 2 | 1122619 | 6.1% |
| 5 | 1017867 | 5.6% |
| 4 | 1009631 | 5.5% |
| 7 | 785840 | 4.3% |
| 6 | 772820 | 4.2% |
| 8 | 710202 | 3.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 18288018 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 6232824 | |
| 1 | 2705551 | |
| - | 2032002 | 11.1% |
| 3 | 1321571 | 7.2% |
| 2 | 1122619 | 6.1% |
| 5 | 1017867 | 5.6% |
| 4 | 1009631 | 5.5% |
| 7 | 785840 | 4.3% |
| 6 | 772820 | 4.2% |
| 8 | 710202 | 3.9% |
Produto
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 15.5 MiB |
| ETANOL |
|---|
Length
| Max length | 6 |
|---|---|
| Median length | 6 |
| Mean length | 6 |
| Min length | 6 |
Characters and Unicode
| Total characters | 12192012 |
|---|---|
| Distinct characters | 6 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | ETANOL |
|---|---|
| 2nd row | ETANOL |
| 3rd row | ETANOL |
| 4th row | ETANOL |
| 5th row | ETANOL |
Common Values
| Value | Count | Frequency (%) |
| ETANOL | 2032002 | 100.0% |
Length
Histogram of lengths of the category
Category Frequency Plot
| Value | Count | Frequency (%) |
| etanol | 2032002 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| E | 2032002 | 16.7% |
| T | 2032002 | 16.7% |
| A | 2032002 | 16.7% |
| N | 2032002 | 16.7% |
| O | 2032002 | 16.7% |
| L | 2032002 | 16.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 12192012 | 100.0% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| E | 2032002 | 16.7% |
| T | 2032002 | 16.7% |
| A | 2032002 | 16.7% |
| N | 2032002 | 16.7% |
| O | 2032002 | 16.7% |
| L | 2032002 | 16.7% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 12192012 | 100.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| E | 2032002 | 16.7% |
| T | 2032002 | 16.7% |
| A | 2032002 | 16.7% |
| N | 2032002 | 16.7% |
| O | 2032002 | 16.7% |
| L | 2032002 | 16.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 12192012 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| E | 2032002 | 16.7% |
| T | 2032002 | 16.7% |
| A | 2032002 | 16.7% |
| N | 2032002 | 16.7% |
| O | 2032002 | 16.7% |
| L | 2032002 | 16.7% |
Data da Coleta
| Distinct | 3691 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 15.5 MiB |
| 31/01/2005 | 3069 |
|---|---|
| 28/02/2007 | 2661 |
| 16/11/2004 | 2592 |
| 28/03/2005 | 2561 |
| 03/01/2005 | 2440 |
| Other values (3686) | 2018679 |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
Characters and Unicode
| Total characters | 20320020 |
|---|---|
| Distinct characters | 11 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 23 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | 03/01/2020 |
|---|---|
| 2nd row | 02/01/2020 |
| 3rd row | 02/01/2020 |
| 4th row | 02/01/2020 |
| 5th row | 02/01/2020 |
Common Values
| Value | Count | Frequency (%) |
| 31/01/2005 | 3069 | 0.2% |
| 28/02/2007 | 2661 | 0.1% |
| 16/11/2004 | 2592 | 0.1% |
| 28/03/2005 | 2561 | 0.1% |
| 03/01/2005 | 2440 | 0.1% |
| 08/03/2005 | 2387 | 0.1% |
| 07/03/2007 | 2338 | 0.1% |
| 18/07/2005 | 2322 | 0.1% |
| 11/04/2005 | 2316 | 0.1% |
| 21/03/2005 | 2306 | 0.1% |
| Other values (3681) | 2007010 | 98.8% |
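The values above are ten-character dd/mm/yyyy strings, which is why the column is profiled as categorical text rather than as dates. Sorting such strings lexicographically does not give chronological order. A minimal sketch of parsing them with the standard library, where `parse_coleta` is a hypothetical helper name:

```python
from datetime import date, datetime


def parse_coleta(value: str) -> date:
    """Parse a dd/mm/yyyy string (day first) into a date object."""
    return datetime.strptime(value, "%d/%m/%Y").date()


raw = ["31/01/2005", "28/02/2007", "16/11/2004"]  # three of the common values above
parsed = sorted(parse_coleta(v) for v in raw)
print(parsed[0].isoformat(), parsed[-1].isoformat())
```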
Length
Histogram of lengths of the category
| Value | Count | Frequency (%) |
| 31/01/2005 | 3069 | 0.2% |
| 28/02/2007 | 2661 | 0.1% |
| 16/11/2004 | 2592 | 0.1% |
| 28/03/2005 | 2561 | 0.1% |
| 03/01/2005 | 2440 | 0.1% |
| 08/03/2005 | 2387 | 0.1% |
| 07/03/2007 | 2338 | 0.1% |
| 18/07/2005 | 2322 | 0.1% |
| 11/04/2005 | 2316 | 0.1% |
| 21/03/2005 | 2306 | 0.1% |
| Other values (3681) | 2007010 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 5582062 | |
| / | 4064004 | |
| 2 | 3470625 | |
| 1 | 2962780 | |
| 5 | 651505 | 3.2% |
| 6 | 631805 | 3.1% |
| 7 | 620033 | 3.1% |
| 4 | 609926 | 3.0% |
| 3 | 603377 | 3.0% |
| 8 | 593340 | 2.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 16256016 | |
| Other Punctuation | 4064004 | 20.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 5582062 | |
| 2 | 3470625 | |
| 1 | 2962780 | |
| 5 | 651505 | 4.0% |
| 6 | 631805 | 3.9% |
| 7 | 620033 | 3.8% |
| 4 | 609926 | 3.8% |
| 3 | 603377 | 3.7% |
| 8 | 593340 | 3.6% |
| 9 | 530563 | 3.3% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 4064004 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 20320020 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 5582062 | |
| / | 4064004 | |
| 2 | 3470625 | |
| 1 | 2962780 | |
| 5 | 651505 | 3.2% |
| 6 | 631805 | 3.1% |
| 7 | 620033 | 3.1% |
| 4 | 609926 | 3.0% |
| 3 | 603377 | 3.0% |
| 8 | 593340 | 2.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 20320020 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 5582062 | |
| / | 4064004 | |
| 2 | 3470625 | |
| 1 | 2962780 | |
| 5 | 651505 | 3.2% |
| 6 | 631805 | 3.1% |
| 7 | 620033 | 3.1% |
| 4 | 609926 | 3.0% |
| 3 | 603377 | 3.0% |
| 8 | 593340 | 2.9% |
Valor de Venda
| Distinct | 3314 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.821610942 |
| Minimum | 0.59 |
|---|---|
| Maximum | 6.699 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.5 MiB |
Quantile statistics
| Minimum | 0.59 |
|---|---|
| 5-th percentile | 0.999 |
| Q1 | 1.298 |
| median | 1.699 |
| Q3 | 2.099 |
| 95-th percentile | 2.999 |
| Maximum | 6.699 |
| Range | 6.109 |
| Interquartile range (IQR) | 0.801 |
Descriptive statistics
| Standard deviation | 0.724623272 |
|---|---|
| Coefficient of variation (CV) | 0.3977925555 |
| Kurtosis | 3.71673882 |
| Mean | 1.821610942 |
| Median Absolute Deviation (MAD) | 0.4 |
| Skewness | 1.58819351 |
| Sum | 3701517.078 |
| Variance | 0.5250788863 |
| Monotonicity | Not monotonic |
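Several of the figures above are derived from one another, so they can be cross-checked. The sketch below recomputes the coefficient of variation, variance, interquartile range, range, and the implied number of observations from the reported values. The variable names are illustrative only, and the inputs are the rounded numbers printed in the tables, so the results agree only to the displayed precision:

```python
# Values copied from the quantile and descriptive statistics tables above.
mean = 1.821610942
std = 0.724623272
q1, q3 = 1.298, 2.099
minimum, maximum = 0.59, 6.699
total = 3701517.078  # the "Sum" row

cv = std / mean                # coefficient of variation
variance = std ** 2            # variance is the squared standard deviation
iqr = q3 - q1                  # interquartile range
value_range = maximum - minimum
n_implied = total / mean       # sum / mean recovers the number of non-missing rows

print(round(cv, 6), round(variance, 6), round(iqr, 3), round(value_range, 3), round(n_implied))
```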
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 1.899 | 84918 | 4.2% |
| 1.299 | 81050 | 4.0% |
| 1.799 | 77609 | 3.8% |
| 1.399 | 67965 | 3.3% |
| 1.999 | 65103 | 3.2% |
| 1.199 | 63921 | 3.1% |
| 1.699 | 49744 | 2.4% |
| 1.499 | 39660 | 2.0% |
| 1.099 | 36743 | 1.8% |
| 2.699 | 34593 | 1.7% |
| Other values (3304) | 1430696 | 70.4% |
| Value | Count | Frequency (%) |
| 0.59 | 7 | < 0.1% |
| 0.599 | 26 | < 0.1% |
| 0.6 | 2 | < 0.1% |
| 0.609 | 5 | < 0.1% |
| 0.61 | 4 | < 0.1% |
| 0.618 | 1 | < 0.1% |
| 0.619 | 66 | < 0.1% |
| 0.62 | 22 | < 0.1% |
| 0.628 | 2 | < 0.1% |
| 0.629 | 85 | < 0.1% |
| Value | Count | Frequency (%) |
| 6.699 | 1 | < 0.1% |
| 6.299 | 1 | < 0.1% |
| 6.199 | 5 | < 0.1% |
| 6.197 | 1 | < 0.1% |
| 6.098 | 1 | < 0.1% |
| 6.047 | 2 | < 0.1% |
| 5.999 | 16 | < 0.1% |
| 5.998 | 2 | < 0.1% |
| 5.939 | 3 | < 0.1% |
| 5.917 | 3 | < 0.1% |
Valor de Compra
| Distinct | 33211 |
|---|---|
| Distinct (%) | 3.1% |
| Missing | 965061 |
| Missing (%) | 47.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.416765039 |
| Minimum | 0.3398 |
|---|---|
| Maximum | 3.314 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.5 MiB |
Quantile statistics
| Minimum | 0.3398 |
|---|---|
| 5-th percentile | 0.7864 |
| Q1 | 1.0118 |
| median | 1.3392 |
| Q3 | 1.6717 |
| 95-th percentile | 2.4186 |
| Maximum | 3.314 |
| Range | 2.9742 |
| Interquartile range (IQR) | 0.6599 |
Descriptive statistics
| Standard deviation | 0.5018230501 |
|---|---|
| Coefficient of variation (CV) | 0.3542034397 |
| Kurtosis | -0.2372562475 |
| Mean | 1.416765039 |
| Median Absolute Deviation (MAD) | 0.3293 |
| Skewness | 0.7281972133 |
| Sum | 1511604.708 |
| Variance | 0.2518263736 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0.96 | 3063 | 0.2% |
| 0.97 | 2691 | 0.1% |
| 1.03 | 2682 | 0.1% |
| 0.95 | 2214 | 0.1% |
| 0.9899 | 2189 | 0.1% |
| 0.9999 | 2115 | 0.1% |
| 0.9799 | 1994 | 0.1% |
| 1 | 1787 | 0.1% |
| 1.08 | 1740 | 0.1% |
| 1.07 | 1690 | 0.1% |
| Other values (33201) | 1044776 | 51.4% |
| (Missing) | 965061 | 47.5% |
| Value | Count | Frequency (%) |
| 0.3398 | 1 | < 0.1% |
| 0.34951 | 1 | < 0.1% |
| 0.4271 | 1 | < 0.1% |
| 0.43204 | 1 | < 0.1% |
| 0.4346 | 1 | < 0.1% |
| 0.436 | 1 | < 0.1% |
| 0.44 | 1 | < 0.1% |
| 0.449 | 1 | < 0.1% |
| 0.4563 | 2 | < 0.1% |
| 0.45631 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 3.314 | 2 | < 0.1% |
| 3.2514 | 2 | < 0.1% |
| 3.2126 | 2 | < 0.1% |
| 3.1945 | 1 | < 0.1% |
| 3.1472 | 2 | < 0.1% |
| 3.1404 | 2 | < 0.1% |
| 3.1243 | 1 | < 0.1% |
| 3.1221 | 1 | < 0.1% |
| 3.1012 | 2 | < 0.1% |
| 3.0754 | 2 | < 0.1% |
Unidade de Medida
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 15.5 MiB |
| R$ / litro |
|---|
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
Characters and Unicode
| Total characters | 20320020 |
|---|---|
| Distinct characters | 9 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | R$ / litro |
|---|---|
| 2nd row | R$ / litro |
| 3rd row | R$ / litro |
| 4th row | R$ / litro |
| 5th row | R$ / litro |
Common Values
| Value | Count | Frequency (%) |
| R$ / litro | 2032002 | 100.0% |
Length
Histogram of lengths of the category
Category Frequency Plot
| Value | Count | Frequency (%) |
| r | 2032002 | 33.3% |
| (blank) | 2032002 | 33.3% |
| litro | 2032002 | 33.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 4064004 | 20.0% |
| R | 2032002 | 10.0% |
| $ | 2032002 | 10.0% |
| / | 2032002 | 10.0% |
| l | 2032002 | 10.0% |
| i | 2032002 | 10.0% |
| t | 2032002 | 10.0% |
| r | 2032002 | 10.0% |
| o | 2032002 | 10.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 10160010 | 50.0% |
| Space Separator | 4064004 | 20.0% |
| Uppercase Letter | 2032002 | 10.0% |
| Currency Symbol | 2032002 | 10.0% |
| Other Punctuation | 2032002 | 10.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| l | 2032002 | 20.0% |
| i | 2032002 | 20.0% |
| t | 2032002 | 20.0% |
| r | 2032002 | 20.0% |
| o | 2032002 | 20.0% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 4064004 | 100.0% |
Uppercase Letter
| Value | Count | Frequency (%) |
| R | 2032002 | 100.0% |
Currency Symbol
| Value | Count | Frequency (%) |
| $ | 2032002 | 100.0% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 2032002 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 12192012 | 60.0% |
| Common | 8128008 | 40.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| R | 2032002 | 16.7% |
| l | 2032002 | 16.7% |
| i | 2032002 | 16.7% |
| t | 2032002 | 16.7% |
| r | 2032002 | 16.7% |
| o | 2032002 | 16.7% |
Common
| Value | Count | Frequency (%) |
| (space) | 4064004 | 50.0% |
| $ | 2032002 | 25.0% |
| / | 2032002 | 25.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 20320020 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 4064004 | 20.0% |
| R | 2032002 | 10.0% |
| $ | 2032002 | 10.0% |
| / | 2032002 | 10.0% |
| l | 2032002 | 10.0% |
| i | 2032002 | 10.0% |
| t | 2032002 | 10.0% |
| r | 2032002 | 10.0% |
| o | 2032002 | 10.0% |
Bandeira
| Distinct | 142 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 15.5 MiB |
| BRANCA | 849055 |
|---|---|
| PETROBRAS DISTRIBUIDORA S.A. | 338909 |
| RAIZEN | 264267 |
| IPIRANGA | 252337 |
| COSAN LUBRIFICANTES | 99969 |
| Other values (137) | 227465 |
Length
| Max length | 28 |
|---|---|
| Median length | 6 |
| Mean length | 10.68584972 |
| Min length | 2 |
Characters and Unicode
| Total characters | 21713668 |
|---|---|
| Distinct characters | 38 |
| Distinct categories | 8 |
| Distinct scripts | 3 |
| Distinct blocks | 3 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 17 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | PETROBRAS DISTRIBUIDORA S.A. |
|---|---|
| 2nd row | RAIZEN |
| 3rd row | BRANCA |
| 4th row | PETROBRAS DISTRIBUIDORA S.A. |
| 5th row | IPIRANGA |
Common Values
| Value | Count | Frequency (%) |
| BRANCA | 849055 | 41.8% |
| PETROBRAS DISTRIBUIDORA S.A. | 338909 | 16.7% |
| RAIZEN | 264267 | 13.0% |
| IPIRANGA | 252337 | 12.4% |
| COSAN LUBRIFICANTES | 99969 | 4.9% |
| CBPI | 94713 | 4.7% |
| ALESAT | 32705 | 1.6% |
| ALE COMBUSTΓõ£├¼VEIS | 16168 | 0.8% |
| PETROSUL | 13927 | 0.7% |
| LIQUIGΓõ£├╝S | 11771 | 0.6% |
| Other values (132) | 58181 | 2.9% |
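Values such as `ALE COMBUSTΓõ£├¼VEIS` and `LIQUIGΓõ£├╝S`, and the Greek and box-drawing characters counted in the statistics, look like mojibake: accented UTF-8 text that was decoded with the wrong code page. The character inventory is consistent with code page 860 (the DOS Portuguese code page) applied twice, but that is an inference from this report, not something the source states. Under that assumption, a minimal sketch of reversing the damage with Python's built-in `cp860` codec (`undo_cp860_mojibake` is a hypothetical helper):

```python
def undo_cp860_mojibake(text: str, rounds: int = 2) -> str:
    """Reverse up to `rounds` passes of 'UTF-8 bytes decoded as cp860'.

    Each pass re-encodes the text as cp860 to recover the original bytes,
    then decodes those bytes as UTF-8. A pass that fails means the text is
    already clean (or was damaged differently), so it is left unchanged.
    """
    for _ in range(rounds):
        try:
            text = text.encode("cp860").decode("utf-8")
        except (UnicodeEncodeError, UnicodeDecodeError):
            break
    return text


for garbled in ["ALE COMBUSTΓõ£├¼VEIS", "LIQUIGΓõ£├╝S", "BRANCA"]:
    print(garbled, "->", undo_cp860_mojibake(garbled))
```

Fixing the encoding when the file is first read is preferable to repairing strings afterwards. The sketch is meant only for confirming the diagnosis on a few values.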
Length
Histogram of lengths of the category
| Value | Count | Frequency (%) |
| branca | 849055 | |
| distribuidora | 341323 | |
| petrobras | 338909 | 11.9% |
| s.a | 338909 | 11.9% |
| raizen | 264267 | 9.3% |
| ipiranga | 252337 | 8.9% |
| cosan | 101719 | 3.6% |
| lubrificantes | 99969 | 3.5% |
| cbpi | 94713 | 3.3% |
| alesat | 32705 | 1.1% |
| Other values (155) | 135387 | 4.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| A | 3818500 | |
| R | 2893780 | |
| I | 2156796 | |
| B | 1753068 | 8.1% |
| N | 1587053 | 7.3% |
| S | 1331735 | 6.1% |
| C | 1173062 | 5.4% |
| T | 866574 | 4.0% |
| O | 833527 | 3.8% |
| E | 831129 | 3.8% |
| Other values (28) | 4468444 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 20087714 | |
| Space Separator | 817851 | 3.8% |
| Other Punctuation | 691965 | 3.2% |
| Other Symbol | 41113 | 0.2% |
| Lowercase Letter | 28891 | 0.1% |
| Currency Symbol | 28775 | 0.1% |
| Other Number | 16285 | 0.1% |
| Modifier Symbol | 1074 | < 0.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 3818500 | |
| R | 2893780 | |
| I | 2156796 | |
| B | 1753068 | |
| N | 1587053 | |
| S | 1331735 | 6.6% |
| C | 1173062 | 5.8% |
| T | 866574 | 4.3% |
| O | 833527 | 4.1% |
| E | 831129 | 4.1% |
| Other values (17) | 2842490 |
Other Symbol
| Value | Count | Frequency (%) |
| ├ | 28775 | |
| ╝ | 11773 | |
| ┤ | 565 | 1.4% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 691952 | |
| ' | 13 | < 0.1% |
Lowercase Letter
| Value | Count | Frequency (%) |
| õ | 28775 | |
| ó | 116 | 0.4% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 817851 | 100.0% |
Currency Symbol
| Value | Count | Frequency (%) |
| £ | 28775 | 100.0% |
Other Number
| Value | Count | Frequency (%) |
| ¼ | 16285 | 100.0% |
Modifier Symbol
| Value | Count | Frequency (%) |
| ` | 1074 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 20087830 | |
| Common | 1597063 | 7.4% |
| Greek | 28775 | 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| A | 3818500 | |
| R | 2893780 | |
| I | 2156796 | |
| B | 1753068 | |
| N | 1587053 | |
| S | 1331735 | 6.6% |
| C | 1173062 | 5.8% |
| T | 866574 | 4.3% |
| O | 833527 | 4.1% |
| E | 831129 | 4.1% |
| Other values (18) | 2842606 |
Common
| Value | Count | Frequency (%) |
| (space) | 817851 | 51.2% |
| . | 691952 | 43.3% |
| £ | 28775 | 1.8% |
| ├ | 28775 | 1.8% |
| ¼ | 16285 | 1.0% |
| ╝ | 11773 | 0.7% |
| ` | 1074 | 0.1% |
| ┤ | 565 | < 0.1% |
| ' | 13 | < 0.1% |
Greek
| Value | Count | Frequency (%) |
| Γ | 28775 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 21569793 | |
| None | 102762 | 0.5% |
| Box Drawing | 41113 | 0.2% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| A | 3818500 | |
| R | 2893780 | |
| I | 2156796 | |
| B | 1753068 | |
| N | 1587053 | 7.4% |
| S | 1331735 | 6.2% |
| C | 1173062 | 5.4% |
| T | 866574 | 4.0% |
| O | 833527 | 3.9% |
| E | 831129 | 3.9% |
| Other values (19) | 4324569 |
None
| Value | Count | Frequency (%) |
| Γ | 28775 | |
| õ | 28775 | |
| £ | 28775 | |
| ¼ | 16285 | |
| ó | 116 | 0.1% |
| Ò | 36 | < 0.1% |
Box Drawing
| Value | Count | Frequency (%) |
| ├ | 28775 | |
| ╝ | 11773 | |
| ┤ | 565 | 1.4% |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, so it is better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
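The definition above can be sketched in plain Python: rank both variables, then divide the covariance of the ranks by the product of their standard deviations. The function names are illustrative, ties receive average ranks (a common convention the text does not specify), and this is not the report's own implementation:

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, shifted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the rank variables over the product of their standard deviations."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)  # the 1/n factors cancel


rho_up = spearman_rho([1, 2, 3, 4, 5], [1, 4, 9, 16, 25])    # nonlinear but increasing
rho_down = spearman_rho([1, 2, 3, 4, 5], [25, 16, 9, 4, 1])  # nonlinear but decreasing
print(rho_up, rho_down)
```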
Pearson's r
Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
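As an illustration of the formula and of the invariance property, the sketch below computes r for a small made-up sample and again after rescaling and shifting one variable. The data and the `pearson_r` name are assumptions for the example only:

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)  # the 1/n factors cancel


x = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up illustrative data
y = [2.0, 4.0, 5.0, 4.0, 5.0]

r = pearson_r(x, y)
# Rescaling x by a positive factor and shifting it leaves r unchanged.
r_rescaled = pearson_r([10 * a + 3 for a in x], y)
print(round(r, 4), round(r_rescaled, 4))
```

The scale factor has to be positive. A negative factor flips the sign of r.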
Kendall's τ
Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
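The pair-counting definition above corresponds to the variant usually called tau-a. A minimal sketch follows, with tied pairs counted as neither concordant nor discordant. Statistical libraries often default to the tie-adjusted tau-b, so results can differ when ties are present. The function name and sample data are illustrative:

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move the same way
        elif direction < 0:
            discordant += 1   # the variables move in opposite ways
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])  # one swapped pair out of six
print(tau)
```

The loop visits every pair, so the cost grows quadratically with the number of observations. The sketch is intended for small samples only.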
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure that has been proposed by Bergsma in 2013 that can be found here.Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. The Phik project provides extensive documentation.
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
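Nullity correlation can be illustrated as the Pearson correlation between presence indicators (1 where a value is present, 0 where it is missing). The sketch below uses the Complemento and Valor de Compra values of the ten rows listed under "First rows", with `None` standing in for NaN; the helper names `presence` and `pearson_r` are assumptions for the example, and ten rows are far too few to say anything about the full dataset. With pandas, the same quantity for all columns is `df.isnull().corr()`.

```python
from math import sqrt

# Values copied from the ten "First rows"; None marks a missing cell (NaN).
complemento = ["KM 210,5-SENT SP/RJ", None, None, None, None, None,
               "ESQ.R.RACHID KASSOUF", None, None, None]
valor_de_compra = [None, None, 2.5895, None, None, 2.6013,
                   2.6887, None, 2.6836, 2.7627]

def presence(column):
    """Map each cell to 1 if a value is present, 0 if it is missing."""
    return [0 if value is None else 1 for value in column]

def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations.

    Undefined (division by zero) for a column that is entirely present
    or entirely missing, so such columns must be excluded beforehand.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)

nullity_corr = pearson_r(presence(complemento), presence(valor_de_compra))
```

A value near +1 would mean the two columns tend to be filled in together, a value near -1 that one tends to be present when the other is missing, and a value near 0 that their missingness is unrelated.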
First rows
| | Semestre | Regiao - Sigla | Estado - Sigla | Municipio | Revenda | CNPJ da Revenda | Nome da Rua | Numero Rua | Complemento | Bairro | Cep | Produto | Data da Coleta | Valor de Venda | Valor de Compra | Unidade de Medida | Bandeira |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2020-01 | SE | SP | GUARULHOS | AUTO POSTO SAKAMOTO LTDA | 49.051.667/0001-02 | RODOVIA PRESIDENTE DUTRA | S/N | KM 210,5-SENT SP/RJ | BONSUCESSO | 07178-580 | ETANOL | 03/01/2020 | 3.199 | NaN | R$ / litro | PETROBRAS DISTRIBUIDORA S.A. |
| 1 | 2020-01 | SE | SP | ADAMANTINA | REDE GAZOLI AUTO POSTO LTDA. | 09.116.143/0001-38 | AVENIDA MARECHAL CASTELO BRANCO | 15 | NaN | VILA JAMIL DE LIMA | 17800-000 | ETANOL | 02/01/2020 | 2.790 | NaN | R$ / litro | RAIZEN |
| 2 | 2020-01 | SE | SP | ADAMANTINA | AUTO POSTO CARREIRO LTDA | 55.451.876/0001-46 | AVENIDA CAP JOSE A DE OLIVEIRA | 160 | NaN | CENTRO | 17800-000 | ETANOL | 02/01/2020 | 2.840 | 2.5895 | R$ / litro | BRANCA |
| 3 | 2020-01 | SE | SP | ADAMANTINA | AUTO POSTO PROGRESSO DE ADAMANTINA LTDA | 52.605.052/0001-95 | AVENIDA RIO BRANCO | 764 | NaN | CENTRO | 17800-000 | ETANOL | 02/01/2020 | 2.849 | NaN | R$ / litro | PETROBRAS DISTRIBUIDORA S.A. |
| 4 | 2020-01 | SE | SP | ADAMANTINA | MARCIO A SPOSITO TRANSPORTES LTDA | 54.187.588/0002-44 | AVENIDA RIO BRANCO | 1625 | NaN | VILA INDUSTRIAL | 17800-000 | ETANOL | 02/01/2020 | 2.730 | NaN | R$ / litro | IPIRANGA |
| 5 | 2020-01 | SE | SP | ADAMANTINA | MAVESA MATUOKA VEICULOS LTDA | 43.001.569/0002-65 | AVENIDA RIO BRANCO | 600 | NaN | CENTRO | 17800-000 | ETANOL | 02/01/2020 | 2.790 | 2.6013 | R$ / litro | IPIRANGA |
| 6 | 2020-01 | SE | SP | AMPARO | AUTO POSTO DBV LTDA | 09.371.227/0001-18 | AVENIDA ANESIO GUIDI | 344 | ESQ.R.RACHID KASSOUF | DISTRITO TRES PONTES | 13909-000 | ETANOL | 02/01/2020 | 3.099 | 2.6887 | R$ / litro | PETROBRAS DISTRIBUIDORA S.A. |
| 7 | 2020-01 | SE | SP | AMPARO | AUTO POSTO PORTAL DAS AGUAS LTDA | 08.772.232/0001-70 | AVENIDA WALDYR BEIRA | 182 | NaN | FIGUEIRA | 13904-452 | ETANOL | 02/01/2020 | 2.999 | NaN | R$ / litro | PETROBRAS DISTRIBUIDORA S.A. |
| 8 | 2020-01 | SE | SP | AMPARO | J M ANDRETA & CIA LTDA | 48.827.125/0001-16 | AVENIDA BERNARDINO DE CAMPOS | 535 | NaN | CENTRO | 13900-400 | ETANOL | 02/01/2020 | 2.990 | 2.6836 | R$ / litro | IPIRANGA |
| 9 | 2020-01 | SE | SP | AMPARO | J M ANDRETA & CIA LTDA | 48.827.125/0002-05 | RUA BENTA MARIA DE BARROS | 181 | NaN | ARCADAS | 13908-000 | ETANOL | 02/01/2020 | 3.090 | 2.7627 | R$ / litro | IPIRANGA |
Last rows
| | Semestre | Regiao - Sigla | Estado - Sigla | Municipio | Revenda | CNPJ da Revenda | Nome da Rua | Numero Rua | Complemento | Bairro | Cep | Produto | Data da Coleta | Valor de Venda | Valor de Compra | Unidade de Medida | Bandeira |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2031992 | 2016-01 | SE | SP | CARAGUATATUBA | AUTO POSTO MORRO SANTO ANTONIO LTDA | 15.753.624/0001-57 | AVENIDA MARGINAL | 6155 | NaN | MASSAGUACU | 11677-000 | ETANOL | 27/06/2016 | 2.694 | 2.1676 | R$ / litro | IPIRANGA |
| 2031993 | 2016-01 | SE | SP | GUARULHOS | CENTRO AUTOMOTIVO GUARUMON LTDA | 18.879.915/0001-84 | RUA OCTAVIO BRAGA DE MESQUITA | S/N | NaN | JARDIM IPANEMA | 07140-020 | ETANOL | 27/06/2016 | 2.199 | 1.9313 | R$ / litro | RAIZEN |
| 2031994 | 2016-01 | SE | SP | POA | AUTO POSTO PADRE EUSTAQUIO LTDA | 19.644.825/0001-77 | AVENIDA VINTE E SEIS DE MARCO | 16 | NaN | CENTRO | 08562-140 | ETANOL | 27/06/2016 | 2.099 | 1.8499 | R$ / litro | BRANCA |
| 2031995 | 2016-01 | SE | SP | SAO PAULO | AUTO POSTO TREVO DA SORTE LTDA | 18.765.781/0001-70 | RUA IBITIRAMA | 704 | NaN | VILA PRUDENTE | 03133-100 | ETANOL | 28/06/2016 | 2.199 | 1.9871 | R$ / litro | IPIRANGA |
| 2031996 | 2016-01 | SE | SP | ITANHAEM | AUTO POSTO BELAS ARTES III LTDA | 19.713.605/0001-58 | RUA NESTOR LEAL | 49 | ESQ. COM GENTIL PEREZ | CENTRO | 11740-000 | ETANOL | 27/06/2016 | 2.649 | 2.1832 | R$ / litro | IPIRANGA |
| 2031997 | 2016-01 | SE | SP | BIRIGUI | COMERCIAL CURI PANDINI LTDA - EPP | 04.238.771/0006-87 | AV. YOUSSEF ISMAIL MANSOUR - ZE TURCO | 169 | NaN | JARDIM ALTO DOS SILVARES | 16202-484 | ETANOL | 27/06/2016 | 1.999 | NaN | R$ / litro | BRANCA |
| 2031998 | 2016-01 | SE | SP | JOSE BONIFACIO | AUTO POSTO TMJ - JOSE BONIFACIO LTDA. | 18.921.535/0001-60 | AVENIDA JOAQUIM MOREIRA DA SILVA | 2875 | NaN | JARDIM PANORAMA | 15200-000 | ETANOL | 27/06/2016 | 1.989 | 1.8010 | R$ / litro | BRANCA |
| 2031999 | 2016-01 | SE | SP | FRANCA | AUTO POSTO DISTRITO BEIRA RIO LTDA | 07.236.625/0001-04 | AVENIDA DOUTOR SEVERINO TOSTES MEIRELLES | 2,02 | NaN | DISTRITO INDUSTRIAL | 14406-004 | ETANOL | 27/06/2016 | 2.395 | NaN | R$ / litro | PETROBRAS DISTRIBUIDORA S.A. |
| 2032000 | 2016-01 | SE | SP | BRAGANCA PAULISTA | USSEN ALI CHAHIME AUTO POSTO EIRELI | 06.107.661/0001-05 | AVENIDA DOS IMIGRANTES | 4133 | NaN | MATADOURO | 12910-341 | ETANOL | 27/06/2016 | 2.299 | NaN | R$ / litro | BRANCA |
| 2032001 | 2016-01 | SE | SP | SANTO ANDRE | PATTO ROSA AUTO POSTO LTDA | 19.842.325/0001-40 | AVENIDA UTINGA | 865 | NaN | VILA METALÚRGICA | 09220-611 | ETANOL | 27/06/2016 | 2.197 | 1.8099 | R$ / litro | BRANCA |
Most frequently occurring
| | Semestre | Regiao - Sigla | Estado - Sigla | Municipio | Revenda | CNPJ da Revenda | Nome da Rua | Numero Rua | Complemento | Bairro | Cep | Produto | Data da Coleta | Valor de Venda | Valor de Compra | Unidade de Medida | Bandeira | # duplicates |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2005-01 | SE | SP | MOGI MIRIM | AUTO POSTO GUAÇU MIRIM LTDA | 58.861.238/0001-91 | RUA PADRE ROQUE | 2348 | NaN | CENTRO | 13800-207 | ETANOL | 02/05/2005 | 1.130 | NaN | R$ / litro | BRANCA | 5 |
| 1 | 2005-01 | SE | SP | MOGI MIRIM | AUTO POSTO GUAÇU MIRIM LTDA | 58.861.238/0001-91 | RUA PADRE ROQUE | 2348 | NaN | CENTRO | 13800-207 | ETANOL | 06/05/2005 | 1.130 | NaN | R$ / litro | BRANCA | 5 |
| 2 | 2005-01 | SE | SP | SERTAOZINHO | ROSAC COMERCIO DE DERIV DE PETROLEO LTD | 01.139.257/0001-91 | AVENIDA ANTONIO PASCHOAL | 69 | | NOVA SERTAOZINHO | 14160-000 | ETANOL | 03/06/2005 | 0.999 | 0.80083 | R$ / litro | CBPI | 5 |
| 3 | 2005-01 | SE | SP | SERTAOZINHO | ROSAC COMERCIO DE DERIV DE PETROLEO LTD | 01.139.257/0001-91 | AVENIDA ANTONIO PASCHOAL | 69 | | NOVA SERTAOZINHO | 14160-000 | ETANOL | 30/05/2005 | 0.999 | 0.80083 | R$ / litro | CBPI | 5 |
| 4 | 2005-01 | SE | SP | VOTUPORANGA | VITORIA COMERCIO DE COMBUSTIVEIS DE VOTUPORANGA LTDA | 02.472.382/0001-81 | AVENIDA BRASIL | 4810 | NaN | JARDIM SAO JUDAS TADEU | 15500-051 | ETANOL | 03/06/2005 | 0.999 | 0.71997 | R$ / litro | BRANCA | 4 |
| 5 | 2005-01 | SE | SP | VOTUPORANGA | VITORIA COMERCIO DE COMBUSTIVEIS DE VOTUPORANGA LTDA | 02.472.382/0001-81 | AVENIDA BRASIL | 4810 | NaN | JARDIM SAO JUDAS TADEU | 15500-051 | ETANOL | 30/05/2005 | 0.999 | 0.71997 | R$ / litro | BRANCA | 4 |